
Conversation

@simon-mo (Collaborator) commented Oct 6, 2025

Summary

  • Clarify the metrics design doc so the Prometheus middleware note no longer references the legacy V0 engine migration
  • Update the speculative decoding guide to state that draft-model support requires the V1 engine instead of pointing to the retired v0.10 release

Testing

  • not run (documentation changes only)

https://chatgpt.com/codex/tasks/task_e_68e3f11c47408329bf2324ac7b1ad7bf

@mergify bot added the `documentation` (Improvements or additions to documentation) label on Oct 6, 2025
@gemini-code-assist bot (Contributor) left a comment


Code Review

This pull request provides a number of documentation updates to remove references to the legacy v0 engine and clarify concepts for the current v1 engine. The changes are well-executed across multiple files, improving the clarity and relevance of the documentation for users. The updates are consistent with the stated goals of the PR, and I have no further suggestions.


We have started the process of deprecating V0. Please read [RFC #18571](gh-issue:18571) for more details.

V1 is now enabled by default for all supported use cases, and we will gradually enable it for every use case we plan to support. Please share any feedback on [GitHub](https://github.com/vllm-project/vllm) or in the [vLLM Slack](https://inviter.co/vllm-slack).
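
(Not part of the diff, just a sketch for readers tracking the migration: while V0 is being phased out, the engine can still be pinned explicitly, assuming the `VLLM_USE_V1` environment variable vLLM has exposed for this transition.)

```python
# Sketch (assumption): pin the engine explicitly while the V0 deprecation is
# in progress. VLLM_USE_V1=1 forces the V1 engine; VLLM_USE_V1=0 falls back
# to V0 where a use case is not yet supported on V1.
import os

os.environ["VLLM_USE_V1"] = "1"  # must be set before importing vllm

from vllm import LLM

llm = LLM(model="facebook/opt-125m")
print(llm.generate(["Hello, my name is"])[0].outputs[0].text)
```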
Member


Also update this paragraph?

| **Mamba Models** | <nobr>🟢 (Mamba-2), 🟢 (Mamba-1)</nobr> |
| **Multimodal Models** | <nobr>🟢 Functional</nobr> |

vLLM V1 currently excludes model architectures with the `SupportsV0Only` protocol.
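
(For context, not from the diff: a rough sketch of how that exclusion works, assuming `SupportsV0Only` is a marker protocol in `vllm.model_executor.models.interfaces` alongside the other `Supports*` interfaces.)

```python
# Rough sketch (import path and usage are assumptions; check the vLLM source).
# A model implementation advertises that it only runs on the legacy V0 engine
# by inheriting from the SupportsV0Only marker protocol; V1 skips such
# architectures when resolving a model.
import torch.nn as nn

from vllm.model_executor.models.interfaces import SupportsV0Only


class MyLegacyModelForCausalLM(nn.Module, SupportsV0Only):
    """Hypothetical architecture that has not been ported to V1 yet."""
```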
@DarkLight1337 (Member) commented Oct 6, 2025


We should remove the V1 column from the Supported Models page and delete all models that don't support V1

Chunked prefill allows vLLM to process large prefills in smaller chunks and batch them together with decode requests. This feature helps improve both throughput and latency by better balancing compute-bound (prefill) and memory-bound (decode) operations.

In vLLM V1, **chunked prefill is always enabled by default**. This is different from vLLM V0, where it was conditionally enabled based on model characteristics.
In vLLM V1, **chunked prefill is always enabled by default** so that behavior is consistent across supported models.
Collaborator Author


Suggested change
In vLLM V1, **chunked prefill is always enabled by default** so that behavior is consistent across supported models.
In vLLM V1, **chunked prefill is always enabled by default**.
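
(Side note for readers of that section, not part of the diff: with chunked prefill always on in V1, the main user-facing knob is the per-step token budget; a minimal sketch, assuming the long-standing `max_num_batched_tokens` engine argument.)

```python
# Sketch (assumption): with chunked prefill always enabled in V1, the per-step
# token budget controls how large each prefill chunk plus decode batch can be.
# Smaller values favor inter-token latency; larger values favor throughput.
from vllm import LLM, SamplingParams

llm = LLM(
    model="facebook/opt-125m",
    max_num_batched_tokens=2048,  # caps tokens scheduled per engine step
)

outputs = llm.generate(
    ["Summarize chunked prefill in one sentence."],
    SamplingParams(max_tokens=32),
)
print(outputs[0].outputs[0].text)
```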

Collaborator Author


There are probably some mistakes here. @markmc PTAL

Collaborator Author


@njhill I guess this page can use a full cleanup

Comment on lines +19 to +20
Speculative decoding with a draft model requires the V1 engine.
Older releases that predate V1 (such as the 0.10.x series) raise a `NotImplementedError`.
Collaborator Author


Suggested change
Speculative decoding with a draft model requires the V1 engine.
Older releases that predate V1 (such as the 0.10.x series) raise a `NotImplementedError`.
Speculative decoding with a draft model is not supported in vLLM V1.
You can use an older version, from before the 0.10.x series, to continue to leverage it.
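
(For reference, not from the PR: when draft-model speculative decoding is available, it is typically configured through the `speculative_config` engine argument; a rough sketch with purely illustrative model names, and with availability depending on the engine version as discussed above.)

```python
# Rough sketch (assumption): draft-model speculative decoding configured via
# speculative_config. Whether this path works depends on the engine version,
# as discussed in this thread; model names are illustrative only.
from vllm import LLM, SamplingParams

llm = LLM(
    model="facebook/opt-6.7b",              # target model (illustrative)
    speculative_config={
        "model": "facebook/opt-125m",       # smaller draft model (illustrative)
        "num_speculative_tokens": 5,        # draft tokens proposed per step
    },
)

out = llm.generate(["The future of AI is"], SamplingParams(max_tokens=16))
print(out[0].outputs[0].text)
```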


Member


We should remove the V1 column from the Supported Models page and delete all models that don't support V1

LGTM after doing this

Collaborator Author


We can probably gradually remove these docs

Labels: codex, documentation